Probabilistic Paths for Protein Complex Inference
نویسندگان
چکیده
Understanding how individual proteins are organized into complexes and pathways is a significant current challenge. We introduce new algorithms to infer protein complexes by combining seed proteins with a confidenceweighted network. Two new stochastic methods use averaging over a probabilistic ensemble of networks, and the new deterministic method provides a deterministic ranking of prospective complex members. We compare the performance of these algorithms with three existing algorithms. We test algorithm performance using three weighted graphs: a naïve Bayes estimate of the probability of a direct and stable protein-protein interaction; a logistic regression estimate of the probability of a direct or indirect interaction; and a decision tree estimate of whether two proteins exist within a common protein complex. The best-performing algorithms in these trials are the new stochastic methods. The deterministic algorithm is significantly faster, whereas the stochastic algorithms are less sensitive to the weighting scheme.
منابع مشابه
Data-Driven Network Analysis and Applications
Data is critical for scientific research and engineering systems. However, data collection procedures are often subject to high cost or heavy loss rate. It is challenging to accurately estimate missing or unobserved data points through the available ones. To cope with this challenge, data interpolation methods have been utilized to approximate the missing data with lower price. In this thesis, ...
متن کاملPredicting protein complex membership using probabilistic network reliability.
Evidence for specific protein-protein interactions is increasingly available from both small- and large-scale studies, and can be viewed as a network. It has previously been noted that errors are frequent among large-scale studies, and that error frequency depends on the large-scale method used. Despite knowledge of the error-prone nature of interaction evidence, edges (connections) in this net...
متن کاملIncrementalizing MCMC in Probabilistic Programs Through Tracing and Slicing
Probabilistic programming enables high-level specification of complex probabilistic models without the need for hand-built inference techniques. We present a dynamic compilation technique for increasing the efficiency of Markov Chain Monte Carlo (MCMC) inference on probabilistic programs. We focus on Church, a probabilistic extension of untyped call-by-value lambda calculus. MCMC in this settin...
متن کاملExtending the Qualitative Trajectory Calculus Based on the Concept of Accessibility of Moving Objects in the Paths
Qualitative spatial representation and reasoning are among the important capabilities in intelligent geospatial information system development. Although a large contribution to the study of moving objects has been attributed to the quantitative use and analysis of data, such calculations are ineffective when there is little inaccurate data on position and geometry or when explicitly explaining ...
متن کاملInferring propagation paths for sparsely observed perturbations on complex networks
In a complex system, perturbations propagate by following paths on the network of interactions among the system's units. In contrast to what happens with the spreading of epidemics, observations of general perturbations are often very sparse in time (there is a single observation of the perturbed system) and in "space" (only a few perturbed and unperturbed units are observed). A major challenge...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2006